Secretory expression in eukaryotes.

Methods and compositions are provided for producing
polypeptide sequences in high yield by employing DNA
constructs, wherein the DNA sequence encoding for the
polypeptide of interest is preceded by a leader sequence and
processing sequence for secreting and processing said
polypeptide. In this manner, the mature polypeptide of
interest may be isolated from the nutrient medium
substantially free of major amounts of other proteins and
cellular debris.

The yeast strain S. cerevisiae AB103 (pYEGF-8) was
deposited on January 5, 1983, at the A.T.C.C. and given
Accession No. 20658.

The plasmid pYαEGF-23 (pAB114-pC1/1) was deposited
at the A.T.C.C. on August 12, 1983, and given Accession No.
40079.

SECRETORY EXPRESSION IN EUKARYOTES

BACKGROUND OF THE INVENTION 
Field of the Invention 

Hybrid DNA technology has revolutionized the
ability to produce polypeptides of an infinite variety
of compositions. Since living forms are composed of
proteins and employ proteins for regulation, the
ability to duplicate these proteins at will offers
unique opportunities for investigating the manner in
which these proteins function and the use of such
proteins, fragments of such proteins, or analogs in
therapy and diagnosis.

There have been numerous advances in improving
the rate and amount of protein produced by a cell.
Most of these advances have been associated with higher
copy numbers, more efficient promoters, and means for
reducing the amount of degradation of the desired
product. It is evident that it would be extremely
desirable to be able to secrete polypeptides of interest,
where such polypeptides are the product of interest.

Furthermore, in many situations, the polypeptide
of interest does not have an initial methionine
amino acid. This is usually a result of there being a
processing signal in the gene encoding for the
polypeptide of interest, which the gene source recognizes
and cleaves with an appropriate peptidase. Since in most
situations, genes of interest are heterologous to the
host in which the gene is to be expressed, such
processing occurs imprecisely and in low yield in the
expression host. In this case, while the protein which is
obtained will be identical to the peptide of interest
for almost all of its sequence, it will differ at the
N-terminus, which can deleteriously affect physiological
activity.

There are, therefore, many reasons why it
would be extremely advantageous to prepare DNA
sequences which would encode for the secretion and
maturing of the polypeptide product. Furthermore,
where sequences can be found for processing, which
result in the removal of amino acids superfluous to the
polypeptide of interest, the opportunity exists for
having a plurality of DNA sequences, either the same or
different, in tandem, which may be encoded on a single
transcript.

Description of the Prior Art 

U.S. Patent No. 4,336,336 describes for prokaryotes
the use of a leader sequence coding for a
non-cytoplasmic protein normally transported to or beyond
the cell surface, resulting in transfer of the fused
protein to the periplasmic space. U.S. Patent No.
4,338,397 describes for prokaryotes using a leader
sequence which provides for secretion with cleavage of
the leader sequence from the polypeptide sequence of
interest. U.S. Patent No. 4,338,397, columns 3 and 4,
provides useful definitions, which definitions are
incorporated herein by reference.

Kurjan and Herskowitz, Cell (1982) 30:933-943
describes a putative α-factor precursor containing four
tandem copies of mature α-factor, describing the
sequence and postulating a processing mechanism.
Kurjan and Herskowitz, Abstracts of Papers presented at
the 1981 Cold Spring Harbor meeting on The Molecular
Biology of Yeasts, page 242, in an Abstract entitled,
"A Putative α-Factor Precursor Containing Four Tandem
Repeats of Mature α-Factor," describe the sequence
encoding for the α-factor and spacers between two of
such sequences. Blair et al., Abstracts of Papers,
ibid., page 243, in an Abstract entitled "Synthesis and
Processing of Yeast Pheromones: Identification and
Characterization of Mutants That Produce Altered
α-Factors," describe the effect of various mutants on
the production of mature α-factor.

SUMMARY OF THE INVENTION
Methods and compositions are provided for
producing mature polypeptides. DNA constructs are
provided which join the DNA fragments encoding for a
yeast leader sequence and processing signal to
heterologous genes for secretion and maturation of the
polypeptide product. The construct of the DNA encoding for
the N-terminal cleavable oligopeptide and the DNA
sequence encoding for the mature polypeptide product
can be joined to appropriate vectors for introduction
into yeast or other cell which recognizes the processing
signals for production of the desired polypeptide.
Other capabilities may also be introduced into the
construct for various purposes.

BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 is a flow diagram indicating the
construction of pYαEGF-21.
Fig. 2 shows sequences at fusions of hEGF to
the vector. a. through e. show the sequences at the
N-terminal region of hEGF, which differ among several
constructions, and f. shows the C-terminal region of
hEGF.

DESCRIPTION OF THE SPECIFIC EMBODIMENTS

In accordance with the subject invention,
eukaryotic hosts, particularly yeast, are employed for
the production of mature polypeptides, where such
polypeptides may be harvested from a nutrient medium.
The polypeptides are produced by employing a DNA
construct encoding for yeast leader and processing
signals joined to a polypeptide of interest, which may
be a single polypeptide or a plurality of polypeptides
separated by processing signals. The resulting
construct encodes for a pre-pro-polypeptide which will
contain the signals for secretion of the
pre-pro-polypeptide and processing of the polypeptide,
either intracellularly or extracellularly, to the mature
polypeptide.

The constructs of the subject invention will, for example,
have at least the following formula defining a
pro-polypeptide:

((R)r-(GAXYCX)n-Gene*)y

wherein:

R is CGX or AZZ, the codons coding for lysine
and arginine, each of the Rs being the same or different;

r is an integer of from 2 to 4, usually 2 to
3, preferably 2 or 4;

X is any of the four nucleotides, T, G, C, or A;

Y is G or C;

y is an integer of at least one and usually
not more than 10, more usually not more than four,
providing for monomers and multimers;

Z is A or G;

Gene* is a gene other than α-factor, usually
foreign to a yeast host, usually a heterologous gene,
desirably a plant or mammalian gene; and

n is 0 or an integer which will generally
vary from 1 to 4, usually 2 to 3.

The pro-polypeptide has an N-terminal processing
signal for peptidase removal of the amino acids
preceding the amino acids encoded for by Gene*.
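
The degenerate symbols in the formula can be expanded mechanically to see which amino acids the processing signal encodes. The following is a minimal sketch, assuming the standard genetic code; the helper names `expand` and `residues` are illustrative and do not come from the specification.

```python
from itertools import product

# Standard genetic code, indexed in T, C, A, G order for each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for (i, a), (j, b), (k, c) in product(enumerate(BASES), repeat=3)
}

# Degenerate symbols exactly as defined in the text above.
DEGENERATE = {"X": "TGCA", "Y": "GC", "Z": "AG"}


def expand(pattern):
    """List every concrete sequence matched by a pattern such as 'CGX'."""
    choices = [DEGENERATE.get(symbol, symbol) for symbol in pattern]
    return ["".join(bases) for bases in product(*choices)]


def residues(pattern):
    """Amino acids encoded by the codons that a three-letter pattern allows."""
    return {CODON_TABLE[codon] for codon in expand(pattern)}


basic = residues("CGX") | residues("AZZ")  # the R (and S) codons
first = residues("GAX")                    # first codon of a GAXYCX unit
second = residues("YCX")                   # second codon of a GAXYCX unit

print("R codons encode:", sorted(basic))
print("GAX encodes:", sorted(first))
print("YCX encodes:", sorted(second))
```

Under these assumptions the R codons give only lysine and arginine, GAX gives only aspartic or glutamic acid, and YCX gives only alanine or proline, which is the dibasic site followed by acidic-Ala or acidic-Pro dipeptides that the formula describes.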

For the most part, the constructs of the
subject invention will have at least the following
formula:

L-(R-S-(GAXYCX)n-Gene*)y

defining a pre-pro-polypeptide, wherein all
the symbols except L and S have been defined, S having
the same definition as R, there being one R and one S, and L
is a leader sequence providing for secretion of the
pre-pro-polypeptide. While it is feasible to have more

(Tr)a-L-(R-S)r"-(GAXYCX)n"-GA[AGCT]

wherein:

all of the symbols previously defined have
the same definition;

a is 0 or 1, intending that the construct may
or may not have the transcriptional and translational
signals;

the nucleotides indicated in the broken box (shown here
in square brackets) are intended not to be present but
to be capable of addition by adding a HindIII cleaved
terminus to provide for the recreation of the sequence
encoding for a dipeptide; and

n" will be 0 to 2, where at least one of the
Xs and Ys defines a nucleotide, so that the sequence in
the parentheses is other than the sequence GAAGCT.

The coding sequence of Gene* may be joined to
the terminal T, providing that the coding sequence is
in frame with the initiation codon and upon processing
the first amino acid will be the correct amino acid for
the mature polypeptide.

The 3'-terminus of Gene* can be manipulated
much more easily and, therefore, it is desirable to
provide a construct which allows for insertion of Gene*
into a unique restriction site in the construct. Such
a construct would provide for a restriction site with
insertion of the Gene* into the restriction site to be
in frame with the initiation codon. Such a construction
can be symbolized as follows:

(Tr)a-L-(R-S)r"-(GAXYCX)n"-W-(SC)b-Te

wherein:

those symbols previously defined have the
same definition;

SC are stop codons;

Te is a termination sequence balanced with
the promoter Tr, and may include other signals, e.g.,
polyadenylation; and

b is an integer which will generally vary
from about 0 to 4, more usually from 0 to 3, it being
understood that Gene* may include its own stop codons.

Illustrative of a sequence having the above
formula is where W is the sequence GA and n" is 2.

Of particular interest is where the sequence
encoding the terminal dipeptide is taken together with
W to define a linker or connector, which allows for
recreation of the terminal sequence defining the
dipeptide of the processing signal and encodes for the
initial amino acids of Gene*, so that the codons are in
frame with the initiation codon of the leader. The
linker provides for a staggered or butt ended
termination, desirably defining a restriction site in
conjunction with the successive sequences of the Gene*.
Upon ligation of the linker with Gene*, the codons of
Gene* will be in frame with the initiation codon of the
leader. In this manner, one can employ a synthetic
sequence which may be joined to a restriction site in
the processing signal sequence to recreate the
processing signal, while providing the initial bases of
the Gene* encoding for the N-terminal amino acids. By
employing a synthetic sequence, the synthetic linker
can be a tailored connector having a convenient
restriction site near the 3'-terminus and the synthetic
connector will then provide for the necessary codons
for the 5'-terminus of the gene.

Alternatively, one could introduce a restriction
endonuclease recognition site downstream from the
processing signal to allow for cleavage and removal of
superfluous bases to provide for ligation of the Gene*
to the processing signal in frame with the initiation
codon. Thus the first codon would encode for the
N-terminal amino acid of the polypeptide. Where T is
the first base of Gene*, one could introduce a
restriction site where the recognition sequence is
downstream from the cleavage site. For example, a Sau3A
recognition sequence could be introduced immediately
after the processing signal, which would allow for
cleavage and linking of the Gene* with its initial codon
in frame with the leader initiation codon. With
restriction endonucleases which have the recognition
sequence distal and downstream from the cleavage site,
e.g., HgaI, W could define such sequence which could
include a portion of the processing signal sequences.
Other constructions can also be employed, employing such
techniques as primer repair and in vitro mutagenesis to
provide for the convenient insertion of Gene* into the
construct by introducing an appropriate restriction
site.

The construct provides a portable sequence
for insertion into vectors, which provide the desired
replication system. As already indicated, in some
instances, it may be desirable to replace the wild type
promoter associated with the leader sequence with a
different promoter. In yeast, promoters involved with
enzymes in the glycolytic pathway can provide for high
rates of transcription. These promoters are associated
with such enzymes as phosphoglucoisomerase,
phosphofructokinase, phosphotriose isomerase,
phosphoglucomutase, enolase, pyruvic kinase,
glyceraldehyde-3-phosphate dehydrogenase, and alcohol
dehydrogenase. These promoters may be inserted upstream
from the leader sequence. The 5'-flanking region to the
leader sequence may be retained or replaced with the
sequence of the alternative promoter. Vectors can be
prepared and have been reported which include promoters
having convenient restriction sites downstream from the
promoter for insertion of such constructs as described
above.

The final construct will be an episomal
element capable of stable maintenance in a host,
particularly a fungal host such as yeast. The construct
will include one or more replication systems, desirably
two replication systems, allowing for maintenance in
the expression host and cloning in a prokaryote. In
addition, one or more markers for selection will be
included, which will allow for selective pressure for
maintenance of the episomal element in the host.
Furthermore, the episomal element may be a high or low
copy number, the copy number generally ranging from
about 1 to 200. With high copy number episomal elements,
there will generally be at least 10, preferably at
least 20, and usually not exceeding about 150, more
usually not exceeding about 100 copy number. Depending
upon the Gene*, either high or low copy numbers may be
desirable, depending upon the effect of the episomal
element on the host. Where the presence of the
expression product of the episomal element may have a
deleterious effect on the viability of the host, a low
copy number may be indicated.

Various hosts may be employed, particularly 
mutants having desired properties. It should be 
appreciated that depending upon the rate of production 
of the expression product of the construct, the pro- 
cessing enzyme may or may not be adequate for process- 
ing at that level of production. Therefore, a mutant 
having enhanced production of the processing enzyme may 
be indicated or enhanced production of the enzyme may 
be provided by means of an episomal element. Generally, 
the production of the enzyme should be of a lower order 
than the production of the desired expression product. 

Where one is using α-factor for secretion and
processing, it would be appropriate to provide for
enhanced production of the processing enzyme Dipeptidyl
Amino Peptidase A, which appears to be the expression
product of STE13. This enzyme appears to be specific
for X-Ala and X-Pro sequences, where X in this instance
intends an amino acid, particularly the dicarboxylic
acid amino acids.

Alternatively, there may be situations where
intracellular processing is not desired. In this
situation, it would be useful to have a ste13 mutant,
where secretion occurs, but the product is not
processed. In this manner, the product may be
subsequently processed in vitro.

Host mutants which provide for controlled
regulation of expression may be employed to advantage.
For example, with the constructions of the subject
invention where a fused protein is expressed, the
transformants have slow growth which appears to be a
result of toxicity of the fused protein. Thus, by
inhibiting expression during growth, the host may be
grown to high density before changing the conditions to
permissive conditions for expression.

A temperature-sensitive sir mutant may be
employed to achieve regulated expression. Mutation in
any of the SIR genes results in a non-mating phenotype
due to in situ expression of the normally silent MATa
and MATα sequences present at the HML and HMR loci.

Furthermore, as already indicated, the Gene* 
may have a plurality of sequences in tandem, either the 
same or different sequences, with intervening processing 
signals. In this manner, the product may be processed 
in whole or in part, with the result that one will 
obtain the various sequences either by themselves or in 
tandem for subsequent processing. In many situations, 
it may be desirable to provide for different sequences, 
where each of the sequences is a subunit of a particular 
protein product. 

The Gene* may encode for any type of
polypeptide of interest. The polypeptide may be as small as
an oligopeptide of 8 amino acids or may be 100,000
daltons or higher. Usually, single chains will be less
than about 300,000 daltons, more usually less than
about 150,000 daltons. Of particular interest are
polypeptides of from about 5,000 to 150,000 daltons,
more particularly of about 5,000 to 100,000 daltons.
Illustrative polypeptides of interest include hormones
and factors, such as growth hormone, somatomedins,
epidermal growth factor, the endocrine secretions, such
as luteinizing hormone, thyroid stimulating hormone,
oxytocin, insulin, vasopressin, renin, calcitonin,
follicle stimulating hormone, prolactin, etc.;
hematopoietic factors, e.g., erythropoietin, colony
stimulating factor, etc.; lymphokines; globins;
globulins, e.g., immunoglobulins; albumins; interferons,
such as α, β and γ; repressors; enzymes; endorphins,
e.g., β-endorphin, enkephalin, dynorphin, etc.

Having prepared the episomal elements
containing the constructs of this invention, one may then
introduce such element into an appropriate host. The
manner of introduction is conventional, there being a
wide variety of ways to introduce DNA into a host.
Conveniently, spheroplasts are prepared employing the
procedure of, for example, Hinnen et al., PNAS USA
(1978) 75:1919-1933 or Stinchcomb et al., EP 0 045 573
A2. The transformants may then be grown in an
appropriate nutrient medium and, where appropriate,
maintaining selective pressure on the transformants. Where
expression is inducible, one can allow for growth of the
yeast to high density and then induce expression. In
those situations where a substantial proportion of the
product may be retained in the periplasmic space, one
can release the product by treating the yeast cells
with an enzyme such as zymolase or lyticase.

The product may be harvested by any convenient
means, purifying the protein by chromatography,
electrophoresis, dialysis, solvent-solvent extraction,
etc.

In accordance with the subject invention, one
can provide for secretion of a wide variety of
polypeptides, so as to greatly enhance product yield,
simplify purification, minimize degradation of the desired
product, and simplify processing, equipment, and
engineering requirements. Furthermore, utilization of
nutrients based on productivity can be greatly enhanced,
so that more economical and more efficient production
of polypeptides may be achieved. Also, the use of
yeast has many advantages in avoiding enterotoxins,
which may be present with prokaryotes, and employing
known techniques, which have been developed for yeast
over long periods of time, which techniques include
isolation of yeast products.

The following examples are offered by way of 
illustration and not by way of limitation. 

EXPERIMENTAL 
A synthetic sequence for human epidermal
growth factor (EGF) based on the amino acid sequence of
EGF reported by H. Gregory and B.M. Preston, Int. J.
Peptide Protein Res. 9, 107-118 (1977) was prepared,
which had the following sequence:

5' AACTCCGACTCCGAATGTCCATTGTCCCACGACGGTTACTGTTTGCACGACGGTGTTTGT
3' TTGAGGCTGAGGCTTACAGGTAACAGGGTGCTGCCAATGACAAACGTGCTGCCACAAACA

   ATGTACATCGAAGCTTTGGACAAGTACGCTTGTAACTGTGTTGTTGGTTACATCGGTGAA
   TACATGTAGCTTCGAAACCTGTTCATGCGAACATTGACACAACAACCAATGTAGCCACTT

   AGATGTCAATACAGAGACTTGAAGTGGTGGGAATTGAGATGA
   TCTACAGTTATGTCTCTGAACTTCACCACCCTTAACTCTACT

where 5' indicates the promoter proximal end of the
sequence. The sequence was inserted into the EcoRI
site of pBR328 to produce a plasmid p328EGF-1 and
cloned.
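
The two strands of the synthetic gene can be checked against each other and against the protein they are meant to encode. The sketch below assumes the standard genetic code and assumes a reading of the printed sequence in which the damaged middle stretch of the upper strand is taken as the base-pairing partner of the lower strand; the variable names are illustrative only.

```python
from itertools import product

# Standard genetic code, indexed in T, C, A, G order for each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for (i, a), (j, b), (k, c) in product(enumerate(BASES), repeat=3)
}

# Upper strand, written 5' -> 3' (assumed reading of the printed sequence).
TOP = (
    "AACTCCGACTCCGAATGTCCATTGTCCCACGACGGTTACTGTTTGCACGACGGTGTTTGT"
    "ATGTACATCGAAGCTTTGGACAAGTACGCTTGTAACTGTGTTGTTGGTTACATCGGTGAA"
    "AGATGTCAATACAGAGACTTGAAGTGGTGGGAATTGAGATGA"
)
# Lower strand, written 3' -> 5' so that it aligns base for base with TOP.
BOTTOM = (
    "TTGAGGCTGAGGCTTACAGGTAACAGGGTGCTGCCAATGACAAACGTGCTGCCACAAACA"
    "TACATGTAGCTTCGAAACCTGTTCATGCGAACATTGACACAACAACCAATGTAGCCACTT"
    "TCTACAGTTATGTCTCTGAACTTCACCACCCTTAACTCTACT"
)

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Every position of the lower strand should pair with the upper strand.
strands_pair = TOP.translate(COMPLEMENT) == BOTTOM

# Translate the upper strand codon by codon; '*' marks a stop codon.
protein = "".join(CODON_TABLE[TOP[i:i + 3]] for i in range(0, len(TOP), 3))

print(strands_pair)
print(protein)
```

With this reading the strands pair at every position and the upper strand translates to a 53-residue chain beginning Asn-Ser-Asp-Ser and ending Glu-Leu-Arg, followed by a TGA stop codon.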

Approximately 30µg of p328EGF-1 was digested
with EcoRI and approximately 1µg of the expected 190
base pair EcoRI fragment was isolated. This was
followed by digestion with the restriction enzyme HgaI.



0116201 



Two synthetic oligonucleotide connectors, HindIII-HgaI
and HgaI-SalI, were then ligated to the 159 base pair
HgaI fragment. The HgaI-HindIII linker had the following
sequence:

AGCTGAAGCT
    CTTCGATTGAG

This linker restores the α-factor processing signals
interrupted by the HindIII digestion and joins the HgaI
end at the 5'-end of the EGF gene to the HindIII end
of pAB112.

The HgaI-SalI linker had the following
sequence:

TGAGATGATAAG
     ACTATTCAGCT

This linker has two stop codons and joins the HgaI end
at the 3'-end of the EGF gene to the SalI end of
pAB112.
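
The way the two linkers fit the HindIII, HgaI and SalI ends can be checked by simple base pairing. The sketch below assumes that each linker is printed with its upper strand 5' to 3' and its lower strand 3' to 5', and that the unpaired bases at each end are the overhangs (4 bases for HindIII and SalI, 5 bases for HgaI); these strand orientations and overhang positions are inferred from the strand lengths rather than stated in the text, and the names used are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def pairs(top_5to3, bottom_3to5):
    """True if two equal-length strands are Watson-Crick complementary."""
    return top_5to3.translate(COMPLEMENT) == bottom_3to5


# Linker strands as printed: upper strand 5'->3', lower strand 3'->5'.
hind_hga_top, hind_hga_bottom = "AGCTGAAGCT", "CTTCGATTGAG"
hga_sal_top, hga_sal_bottom = "TGAGATGATAAG", "ACTATTCAGCT"

# Ends of the synthetic EGF coding strand, copied from the sequence above.
EGF_START = "AACTCCGACTCC"
EGF_END = "GAATTGAGATGA"

# HindIII-HgaI linker: 4-base HindIII end, 6 paired bases, 5-base HgaI end.
hind_end = hind_hga_top[:4]
duplex_1 = pairs(hind_hga_top[4:], hind_hga_bottom[:6])
fits_egf_start = pairs(EGF_START[:5], hind_hga_bottom[6:])

# HgaI-SalI linker: 5-base HgaI end, 7 paired bases, 4-base SalI end.
duplex_2 = pairs(hga_sal_top[5:], hga_sal_bottom[:7])
fits_egf_end = EGF_END[4:] == hga_sal_top[:8]
sal_end = hga_sal_bottom[7:][::-1]  # reversed so that it reads 5'->3'

# Codons across the 3' junction, kept in the EGF reading frame.
junction = EGF_END[:4] + hga_sal_top
codons = [junction[i:i + 3] for i in range(0, len(junction) - 2, 3)]

print(hind_end, sal_end, codons)
```

Under these assumptions the first linker contributes GAA GCT (glu-ala) directly ahead of the first EGF codon, and the second linker reads through the last EGF codons into TGA and TAA, consistent with the two stop codons mentioned above.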

The resulting 181 base pair fragment was
purified by preparative gel electrophoresis and ligated
to 100ng of pAB112 which had been previously completely
digested with the enzymes HindIII and SalI. Surprisingly,
a deletion occurred where the codons for the 3rd and 4th
amino acids of EGF, asp and ser, were deleted, with the
remainder of the EGF being retained.

pAB112 is a plasmid containing a 1.75kb EcoRI
fragment with the yeast α-factor gene cloned in the
EcoRI site of pBR322 in which the HindIII and SalI
sites had been deleted (pAB11). pAB112 was derived from
plasmid pAB101 which contains the yeast α-factor gene
as a partial Sau3A fragment cloned in the BamHI site of
plasmid YEp24. pAB101 was obtained by screening a
yeast genomic library in YEp24 using a synthetic 20-mer
oligonucleotide probe (3'-GGCCGGTTGGTTACATGATT-5')
homologous to the published α-factor coding region
(Kurjan and Herskowitz, Abstracts 1981 Cold Spring
Harbor meeting on the Molecular Biology of Yeasts,
page 242).

The resulting mixture was used to transform
E. coli HB101 cells and plasmid pAB201 obtained.
Plasmid pAB201 (5µg) was digested to completion with
the enzyme EcoRI and the resulting fragments were:
a) filled in with DNA polymerase I Klenow fragment;
b) ligated to an excess of BamHI linkers; and
c) digested with BamHI. The 1.75kbp EcoRI fragment was
isolated by preparative gel electrophoresis and
approximately 100ng of the fragment was ligated to
100ng of pC1/1, which had been previously digested to
completion with the restriction enzyme BamHI and
treated with alkaline phosphatase.

Plasmid pC1/1 is a derivative of pJDB219,
Beggs, Nature (1978) 275:104, in which the region
corresponding to bacterial plasmid pMB9 in pJDB219 has
been replaced by pBR322 in pC1/1. This mixture was
used to transform E. coli HB101 cells. Transformants
were selected by ampicillin resistance and their
plasmids analysed by restriction endonucleases. DNA
from one selected clone (pYEGF-8) was prepared and used
to transform yeast AB103 cells. Transformants were
selected by their leu+ phenotype.

Fifty milliliter cultures of yeast strain
AB103 (α, pep 4-3, leu 2-3, leu 2-112, ura 3-52, his
4-580) transformed with plasmid pYEGF-8 (deposited at
the American Type Culture Collection on 5th January
1983 and given ATCC Accession no. 20658) were grown at
30° in -leu medium to saturation (optical density at
600nm of 5) and left shaking at 30° for an additional
12 hr period. Cell supernatants were collected by
centrifugation and analyzed for the presence of human
EGF using the fibroblast receptor competition binding
assay. The assay of EGF is based on the ability of
both mouse and human EGF to compete with 125I-labeled
mouse EGF for binding sites on human foreskin
fibroblasts. Standard curves can be obtained by measuring
the effects of increasing quantities of EGF on the
binding of a standard amount of 125I-labeled mouse EGF.
Under these conditions 2 to 20 ng of EGF are readily
measurable. Details on the binding of 125I-labeled
epidermal growth factor to human fibroblasts have been
described by Carpenter et al., J. Biol. Chem. 250, 4297
(1975). Using this assay it is found that the culture
medium contains 7±1mg of human EGF per liter.

For further characterization, human EGF
present in the supernatant was purified by absorption
to the ion-exchange resin Biorex-70 and elution with
10mM HCl in 80% ethanol. After evaporation of the HCl
and ethanol the EGF was solubilized in water. This
material migrates as a single major protein of MW
approx. 6,000 in 17.5% SDS gels, roughly the same as
authentic mouse EGF (MW~6,000). This indicates that
the α-factor leader sequence has been properly excised
during the secretion process. Analysis by high
resolution liquid chromatography (microbondapak C18, Waters
column) indicates that the product migrates with a
retention time similar to an authentic mouse EGF
standard. However, protein sequencing by Edman
degradation showed that the N-terminus retained the glu-ala
sequence.
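
A rough mass estimate shows why the gel result and the sequencing result are compatible. The sketch below assumes an average residue mass of about 110 daltons, a common rule of thumb that is not taken from the specification, and uses the 53 codons of the synthetic EGF gene described above; the function name is illustrative.

```python
# Assumed rule-of-thumb average mass per amino acid residue, in daltons.
AVERAGE_RESIDUE_MASS_DA = 110


def approximate_mass(n_residues):
    """Crude polypeptide mass estimate from the residue count alone."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA


mature_hegf = approximate_mass(53)       # 53 codons in the synthetic gene
glu_ala_hegf = approximate_mass(53 + 2)  # with the retained glu-ala dipeptide

print(mature_hegf, glu_ala_hegf, glu_ala_hegf - mature_hegf)
```

On this estimate both forms fall close to the observed value of approximately 6,000, and they differ by only a few hundred daltons, so a two-residue extension would not be expected to show up as a separate band on an SDS gel and is detected here only by sequencing.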

A number of other constructions were prepared
using different constructions for joining hEGF to the
α-factor secretory leader sequence, providing for
different processing signals and site mutagenesis. In
Fig. 2, a. through e. show the sequence of the fusions at
the N-terminal region of hEGF, which sequences differ
among several constructions; f. shows the sequences at
the C-terminal region of hEGF, which is the same for all
constructions. Synthetic oligonucleotide linkers used
in these constructions are boxed.

These fusions were made as follows.
Construction (a) was made as described above. Construction (b)
was made in a similar way except that linker 2 was used
instead of linker 1. Linker 2 modifies the α-factor
processing signal by inserting an additional processing
site (ser-leu-asp-lys-arg) immediately preceding the
hEGF gene. The resulting yeast plasmid is named
pYαEGF-22. Construction (c), in which the dipeptidyl
aminopeptidase maturation site (glu-ala) has been removed,
was obtained by in vitro mutagenesis of construction
(a). A PstI-SalI fragment containing the α-factor
leader-hEGF fusion was cloned in phage M13 and isolated
in a single-stranded form. A synthetic 31-mer of
sequence 5'-TCTTTGGATAAAAGAAACTCCGACTCCCG-3' was
synthesized and 70 picomoles were used as a primer for
the synthesis of the second strand from 1 picomole of
the above template by the Klenow fragment of DNA
polymerase. After fill-in and ligation at 14° for 18
hrs, the mixture was treated with S1 nuclease (5 units
for 15 min) and used to transfect E. coli JM101 cells.
Bacteriophage containing DNA sequences in which the
region coding for (glu-ala) was removed were located by
filter plaque hybridization using the 32P-labeled
primer as probe. RF DNA from positive plaques was
isolated, digested with PstI and SalI and the resulting
fragment inserted in pAB114 which had been previously
digested to completion with SalI and partially with
PstI and treated with alkaline phosphatase.

The plasmid pAB114 was derived as follows:
plasmid pAB112 was digested to completion with HindIII
and then religated at low (4µg/ml) DNA concentration
and plasmid pAB113 was obtained in which three 63 bp
HindIII fragments have been deleted from the α-factor
structural gene, leaving only a single copy of mature
α-factor coding region. A BamHI site was added to
plasmid pAB11 by cleavage with EcoRI, filling in of the
overhanging ends by the Klenow fragment of DNA
polymerase, ligation of BamHI linkers, cleavage with
BamHI and religation to obtain pAB12. Plasmid pAB113
was digested with EcoRI, the overhanging ends filled
in, and ligated to BamHI linkers. After digestion with
BamHI the 1500bp fragment was gel-purified and ligated
to pAB12 which had been digested with BamHI and treated
with alkaline phosphatase. Plasmid pAB114, which
contains a 1500bp BamHI fragment carrying the α-factor
gene, was obtained. The resulting plasmid (pAB114
containing the above described construct) is then
digested with BamHI and ligated into plasmid pC1/1.
The resulting yeast plasmid is named pYαEGF-23
and was deposited at the American Type Culture Collection
on 12th August 1983 under ATCC Accession no. 40079.

Construction (d), in which a new KpnI site
was generated, was made as described for construction
(c) except that the 36-mer oligonucleotide primer of
sequence 5'-GGGTACCTTTGGATAAAAGAAACTCCGACTCCGAAT-3' was
used. The resulting yeast plasmid is named pYαEGF-24.
Construction (e) was derived by digestion of the
plasmid containing construction (d) with KpnI and SalI
instead of linker 1 and 2. The resulting yeast plasmid
is named pYαEGF-25.

Yeast cells transformed with pYαEGF-22 were
grown in 15 ml cultures. At the indicated densities or
times, cultures were centrifuged and the supernatants
saved and kept on ice. The cell pellets were washed in
lysis buffer (0.1% Triton X-100, 10mM NaHPO4 pH 7.5) and
broken by vortexing (5min in 1min intervals with
cooling on ice in between) in one volume of lysis
buffer and one volume of glass beads. After
centrifugation, the supernatants were collected and kept
on ice. The amount of hEGF in the culture medium and cell
extracts was measured using the fibroblast receptor
binding competition assay. Standard curves were
obtained by measuring the effects of increasing
quantities of mouse EGF on the binding of a standard
amount of 125I-labeled mouse EGF.

Proteins were concentrated from the culture
media by absorption on Bio-Rex 70 resin and elution
with 0.01N HCl in 80% ethanol and purified by high
performance liquid chromatography (HPLC) on a reverse
phase C18 column. The column was eluted at a flow rate
of 4ml/min with a linear gradient of 5% to 80%
acetonitrile containing 0.2% trifluoroacetic acid in 60min.
Proteins (200-800 picomoles) were sequenced at the
amino-terminal end by the Edman degradation method
using a gas-phase protein sequencer, Applied Biosystems
model 470A. The normal PROTFA program was used for all
the analyses. Dithiothreitol was added to S2 (ethyl
acetate: 20mg/liter) and S3 (butyl chloride: 10mg/liter)
immediately before use. All samples were treated with
1N HCl in methanol at 40° for 15min to convert
PTH-aspartic acid and PTH-glutamic acid to their methyl
esters. All PTH-amino acid identifications were
performed by reference to retention times on an IBM CN
HPLC column using a known mixture of PTH-amino acids as
standards.

15 Secretion from pYotEGF-22 gave a 4:1 mole 

ratio of native N-terminus hEGF to glu-ala terminated 
hEGF, while secretion from pYaEGF- 23-25 gave only 
native N- terminated hEGF. Yields of hEGF ranged from 5 
to 8jjg/ml measured either as protein or in a receptor 

20 binding assay. • 
The strain JRY188 (MATα sir3-8 leu2-3 leu2-112
trp1 ura3 his4 rme) was transformed with pYαEGF-21 and
leucine prototrophs selected at 37°. Saturated
cultures were then diluted 1/100 in fresh medium and
grown in leucine selective medium at permissive (24°)
and non-permissive (36°) temperatures and culture
supernatants were assayed for the presence of hEGF as
described above. The results are shown in the
following table.

Regulated synthesis and secretion of hEGF in transformed
yeast sir3 temperature-sensitive mutants.

Temperature   Transformant   O.D.650   hEGF (µg/ml)

36°           3a             3.5       0.010
                             5.4       0.026

              3b             3.6       0.020
                             6.4       0.024

24°           3a             0.4       34
                             1.3       145
                             2.1       1075
                             4.0       3250

              3b             0.4       32
                             1.4       210
                             2.2       1935
                             4.2       4600


These results indicate that the hybrid 
α-factor/EGF gene is being expressed under mating type
regulation, even though it is present on a high copy 
number plasmid. 
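The size of the temperature effect can be estimated by normalizing hEGF to cell density. The sketch below is an illustration, not part of the original analysis. It assumes that each O.D.650 reading in the table pairs, in order, with the hEGF value listed for the same transformant and temperature, and it uses only the densest sample of each culture. The dictionary and function names are hypothetical.

```python
# (temperature, transformant) -> (O.D.650, hEGF) for the densest sample,
# read from the table under the row-pairing assumption stated above.
DENSEST_SAMPLE = {
    ("36", "3a"): (5.4, 0.026),
    ("36", "3b"): (6.4, 0.024),
    ("24", "3a"): (4.0, 3250.0),
    ("24", "3b"): (4.2, 4600.0),
}


def hegf_per_od(temperature, transformant):
    """hEGF concentration divided by optical density for one culture."""
    od, hegf = DENSEST_SAMPLE[(temperature, transformant)]
    return hegf / od


def permissive_to_nonpermissive_ratio(transformant):
    """Ratio of density-normalized hEGF at 24 degrees to that at 36 degrees."""
    return hegf_per_od("24", transformant) / hegf_per_od("36", transformant)


if __name__ == "__main__":
    for name in ("3a", "3b"):
        print(name, round(permissive_to_nonpermissive_ratio(name)))
```

With these pairings, both transformants show a density-normalized difference on the order of 10^5 between the permissive and non-permissive temperatures, which is consistent with the conclusion drawn in the text. Because the ratio is unitless, it does not depend on the concentration unit printed in the table header.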

In accordance with the subject invention,
novel constructs are provided which may be inserted 
into vectors to provide for expression of polypeptides 
having an N-terminal leader sequence and one or more 
processing signals to provide for secretion of the 
polypeptide as well as processing to result in a mature 
polypeptide product free of superfluous amino acids. 
Thus, one can obtain a polypeptide having the identical 
sequence to a naturally occurring polypeptide. In 
addition, because the polypeptide can be produced in 
yeast, glycosylation can occur, so that products can be
obtained which are identical to the naturally occurring
products. Furthermore, because the product is secreted,
greatly enhanced yields can be obtained based on cell 
population and processing and purification are greatly 
simplified. In addition, employing mutant hosts, 
expression can be regulated to be turned off or on, as
desired. 

Although the foregoing invention has been 
described in some detail by way of illustration and 
example for purposes of clarity of understanding, it 
will be obvious that certain changes and modifications
may be practiced within the scope of the appended 
claims.

CLAIMS

1. A DNA construct encoding a pre-pro-poly- 
peptide, said DNA construct encoding pre-pro-polypeptide 
comprising a yeast leader sequence, processing signals 
for processing the pre-pro-polypeptide to a mature poly- 
peptide and a gene encoding a polypeptide other than the 
wild type gene associated with said leader sequence. 

2. A DNA construct according to Claim 1, 
including at the 5' end of the sequence a yeast pro-
moter and wherein said gene is heterologous to said
yeast host.

3. A DNA construct according to Claim 2, 
wherein said yeast promoter is the α-factor promoter
and said yeast leader is a leader sequence encoding for
at least a major portion of the α-factor leader and is
capable of providing for secretion.

4. A DNA construct according to Claim 2, 
wherein said gene is a mammalian gene. 

5. A DNA construct comprising a sequence of 
the following formula: 

L-((R)r-(GAXYCX)n-Gene*)y

wherein:

L is a leader sequence recognized by yeast
for secretion;

R is a codon coding for arginine or lysine;

r is an integer of from 2 to 4;

X is any nucleotide;

Y is guanosine or cytosine;

y is an integer of from about 1 to 10;

Gene* is a gene foreign to yeast; and 

n is 0 or 1 to 4.

6. A DNA construct according to Claim 5,
wherein n is 0 to 4 and the nucleotides of said Gene*
proximal to R at least in part define a recognition
site for a restriction endonuclease.

7. A DNA construct according to Claim 6,
wherein said leader sequence is the α-factor leader
sequence.

8. A DNA construct according to Claim 7, 
wherein n is 0. 
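The processing-signal portion of the Claim 5 formula, (R)r-(GAXYCX)n, can be read as a pattern over in-frame codons. The sketch below is one possible reading, offered only as an illustration: it takes R as any standard arginine or lysine codon, r as 2 to 4, and n as 0 to 4 (n = 0 being the case of Claim 8). It does not model the leader L, Gene*, or the repeat count y. The function name and both example sequences are hypothetical.

```python
import re

# Lys: AAA, AAG. Arg: CGN, AGA, AGG (standard genetic code).
ARG_OR_LYS = r"(?:AA[AG]|CG[ACGT]|AG[AG])"
# Hexanucleotide GAXYCX: X is any nucleotide, Y is G or C.
SPACER = r"(?:GA[ACGT][GC]C[ACGT])"
SIGNAL = re.compile(rf"((?:{ARG_OR_LYS}){{2,4}})((?:{SPACER}){{0,4}})")


def find_processing_signal(coding_seq):
    """Return (start, r, n) for the first in-frame signal, or None.

    start is the 0-based nucleotide offset, r the number of Arg/Lys
    codons matched, and n the number of GAXYCX hexanucleotides after them.
    """
    seq = coding_seq.upper()
    # Step codon by codon so that matches stay in the reading frame.
    for start in range(0, len(seq) - 5, 3):
        match = SIGNAL.match(seq, start)
        if match:
            r = len(match.group(1)) // 3
            n = len(match.group(2)) // 6
            return start, r, n
    return None


# Illustrative sequences only: start codon, Lys-Arg, optional spacers,
# then two arbitrary codons standing in for the start of Gene*.
WITH_SPACERS = "ATG" + "AAAAGA" + "GAGGCTGAAGCT" + "AACTCC"
WITHOUT_SPACERS = "ATG" + "AAAAGA" + "AACTCC"

if __name__ == "__main__":
    print(find_processing_signal(WITH_SPACERS))
    print(find_processing_signal(WITHOUT_SPACERS))
```

Each GAXYCX unit encodes Glu or Asp followed by Ala or Pro, so the n = 2 example corresponds to a Lys-Arg site followed by a Glu-Ala-Glu-Ala spacer, while the n = 0 example places the foreign gene directly after the dibasic site.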

9. A DNA construct of the formula:

Tr-L-(R-S-(GAXYCX)n'-W-(Gene*)d)y

wherein:

Tr is a sequence having transcriptional and
translational regulatory signals for initiation and
processing of transcription and translation, wherein
said regulatory signals are recognized by yeast;

L is a leader sequence for secretion by yeast;

R and S are codons expressing arginine and lysine;

X is any nucleotide;

Y is cytosine or guanosine;

y is an integer of from 1 to 4;

n' is a whole number of from 0 to 4;

W is a deoxyribosyl-3' group or when n' is
other than 0, one or more nucleotides which by themselves
or together with the hexanucleotide in the parenthesis
define a restriction site;

Gene* either by itself or taken together with
W defines a polypeptide sequence foreign to yeast; and

d is 0 or 1, being 1 when y is greater than 1.

10. A DNA construct according to Claim 9,
wherein Tr is a sequence defining the regulatory 
signals for α-factor, d is 1 and Gene* and W are taken
together to define a polypeptide foreign to yeast. 

11. A DNA construct according to Claim 9 
wherein n' is 0.

12. A DNA construct according to Claim 11, 
wherein said polypeptide product is a mammalian poly-
peptide.

13. A DNA construct comprising a sequence of 
the formula: 

(Tr)a-L-R-S-(GAXYCX)(n''-1)-GA[AGCT]

wherein:

Tr is a sequence defining transcriptional and
translational regulatory signals for initiation and
processing of transcription and translation recognized
by yeast;

a is 0 or 1;

L is a leader sequence recognized by yeast;

R and S are codons encoding for lysine and arginine;

X is any nucleotide;

Y is cytosine or guanosine;

n'' is 2 to 4;

the nucleotides in the broken box (shown here in
brackets) indicate the nucleotides which are
complementary to the overhang of the non-coding chain
to define a HindIII restriction site.

14. A DNA construct according to Claim 13,
where Tr is a sequence defining the transcriptional and 
translational regulatory signals of α-factor.

15. An expression episomal element compris- 
ing a replication system for providing stable mainte- 
nance in yeast and a sequence of the formula: 

Tr-L-(R)r'-(GAXYCX)n'-W-Te

wherein: 

Tr is a sequence defining transcriptional and 
translational regulatory signals for initiation and 
processing of transcription and translation in yeast; 

L is a leader sequence recognized by yeast 
for secretion; 

R is a codon defining arginine or lysine; 

r' is a whole number in the range of 2 to 4; 

X is any nucleotide; 

Y is cytosine or guanosine;

n' is a whole number in the range of 0 to 4;

W is a nucleotide sequence of at least 1
nucleotide, which by itself or when n' is other than 0,
in conjunction with nucleotides in the parenthesis 
defines a restriction site; 

Te is a sequence defining a terminator 
balanced with said transcriptional initiator sequence. 

16. An expression episomal element according
to Claim 15 wherein Tr is derived from α-factor and n'
is 2 to 3.

17. An expression episomal element according
to Claim 14, wherein Tr is derived from α-factor and n'
is 0.



18. An episomal expression vector according 
to Claim 17, having a gene foreign to yeast intermediate
R and Te and in reading frame with the initiation codon
of L. 

19. An episomal expression element according
to Claim 18, wherein said gene is a mammalian gene.

20. An episomal element according to Claim
19, wherein said mammalian gene is human epidermal
growth factor.

21. An episomal expression vector according 
to Claim 16, having a gene foreign to yeast intermediate 
the nucleotides in the parentheses and Te and in 
reading frame with the initiation codon of L. 

22. A method for producing a polypeptide
foreign to yeast and having such polypeptide secreted
into the culture medium, said method comprising:

growing yeast containing an episomal expression
element according to Claim 16, whereby the
encoding sequences are expressed to produce a pre-pro-
polypeptide; and

said pre-pro-polypeptide is at least partially 
processed and secreted. 

23. An episomal expression vector according 
to Claim 17, having a gene foreign to yeast intermediate
the nucleotides in the parentheses and Te and in 
reading frame with the initiation codon of L. 

24. A method for producing a polypeptide 
foreign to yeast and having such polypeptide secreted 
into the culture medium, said method comprising: 

growing yeast mutants containing an episomal 
expression element according to Claim 16, wherein said 
mutant permits external regulation of expression,
whereby the encoding sequences are expressed to produce
a pre-pro-polypeptide under permissive conditions; and 

said pre-pro-polypeptide is at least partially 
processed and secreted. 

25. A method according to Claim 24, wherein
said mutant yeast is a temperature-sensitive sir
mutant.

26. A method for producing a polypeptide 
foreign to yeast and having such polypeptide secreted 
into the culture medium, said method comprising:

growing yeast mutants containing an episomal 
expression element according to Claim 17, wherein said 
mutant permits external regulation of expression, 
whereby the encoding sequences are expressed to produce 
a pre-pro-polypeptide under permissive conditions; and

said pre-pro-polypeptide is at least partially 
processed and secreted.

27. A method according to Claim 26, wherein
said mutant yeast is a temperature-sensitive sir
mutant.

[Drawing residue: plasmid construction scheme. Recoverable labels:
T4 DNA ligase; steps 1. EcoRI, 2. DNA polymerase (Klenow),
3. BamHI linkers, 4. BamHI, 5. T4 DNA ligase.]

Linker-1  5'-AGCTGAAGCT-3'
          3'-CTTCGATTGAG-5'

Linker-4  5'-TGAGATGATAAG-3'
          3'-ACTATTCAGCT-5'